

Content On This Page

  • t-Distribution: Properties (Implicit)
  • t-Test (one sample t-test): Procedure and Application
  • t-Test (two independent groups t-test): Procedure and Application


Inferential Statistics: t-Test




t-Distribution: Properties (Implicit)


Background: When the Population Standard Deviation is Unknown

In hypothesis testing and confidence interval estimation for a population mean ($\mu$), if the population standard deviation ($\sigma$) is known and the population is normally distributed (or the sample size is large, $n \ge 30$, so the Central Limit Theorem applies), the test statistic $\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ follows the standard normal distribution (Z-distribution).

However, the population standard deviation ($\sigma$) is frequently unknown in real-world applications. In such cases, we must estimate it using the sample standard deviation ($s$), calculated from the sample data.

When we replace the known $\sigma$ with the estimated sample standard deviation $s$ in the test statistic formula ($\frac{\bar{x} - \mu}{s/\sqrt{n}}$), the resulting statistic introduces additional variability. This is because the sample standard deviation $s$ itself is a random variable that varies from sample to sample, unlike the fixed population parameter $\sigma$. Due to this extra variability, especially in smaller samples, the distribution of the test statistic is no longer the standard normal distribution.

Instead, the test statistic $\frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows a different distribution called the **t-distribution** (or Student's t-distribution). This distribution was first described by William Sealy Gosset in 1908, who published under the pseudonym "Student".


Properties of the t-Distribution

The t-distribution is a family of probability distributions that share some similarities with the standard normal distribution but have key differences:

  1. Shape:

    Like the standard normal distribution, the t-distribution is symmetric, bell-shaped, and centered at 0. Its overall form resembles the Z-distribution.

  2. Parameter: Degrees of Freedom (df):

    Unlike the single standard normal distribution, there is a unique t-distribution for each positive integer value of a parameter called **degrees of freedom (df)**. The degrees of freedom are related to the sample size $n$. For a one-sample t-test for a mean, the degrees of freedom are $df = n-1$. The shape of the t-distribution depends on its degrees of freedom.

  3. Tails and Variability:

    Compared to the standard normal distribution, the t-distribution has **heavier tails** (more probability in the tails) and is **more spread out** (greater variance). This accounts for the extra uncertainty introduced by estimating $\sigma$ with $s$. The lower the degrees of freedom (i.e., smaller sample size), the heavier the tails and the flatter and wider the distribution. This reflects greater uncertainty when the sample standard deviation is based on less data.

  4. Approximation to the Normal Distribution:

    As the degrees of freedom ($df$) increase, the t-distribution becomes increasingly similar to the standard normal (Z) distribution. As $df \to \infty$, the t-distribution converges to the standard normal distribution. For practical purposes, when $df$ is large (commonly cited thresholds are $df > 30$ or $df > 100$), the t-distribution is very close to the Z-distribution, and sometimes the Z-test is used as an approximation to the t-test for large samples.

    [Figure: comparison of t-distribution curves for different degrees of freedom with the standard normal curve]

Because of these properties, when $\sigma$ is unknown, we use t-distribution tables or software to find critical values and p-values for hypothesis tests and confidence intervals involving the mean, using the appropriate degrees of freedom $(n-1)$.
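These properties can be checked numerically. The following sketch uses only Python's standard library (with `math.lgamma` to keep the gamma-function ratio stable for large $df$) to evaluate the t-density directly from its formula and compare it with the standard normal density:

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal (Z) distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Heavier tails: at x = 2.5 the t-density exceeds the normal density,
# and the gap shrinks as the degrees of freedom grow.
for df in (3, 10, 30, 100):
    print(df, round(t_pdf(2.5, df), 5))
print("normal:", round(normal_pdf(2.5), 5))
```

Running this shows the tail value at $x = 2.5$ decreasing toward the normal density as $df$ increases, which is exactly properties 3 and 4 above.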



t-Test (one sample t-test): Procedure and Application


Purpose

The **one-sample t-test** is a statistical hypothesis test used to determine whether the mean of a single sample is significantly different from a known or hypothesized population mean ($\mu_0$) when the population standard deviation ($\sigma$) is **unknown** and must be estimated from the sample data using the sample standard deviation ($s$).

This test is widely applied in various scenarios where you want to compare an observed sample mean to a target value, standard value, or a previously known population mean, but you do not know the true variability ($\sigma$) of the population.

Examples: checking whether a filling machine's mean output matches its stated target volume, or whether the average score of a group of students differs from a known standard value.


Assumptions for the One-Sample t-Test

For the results of a one-sample t-test to be valid, certain assumptions about the data should be met:

  1. **Random Sample:** The sample should be a simple random sample drawn from the population of interest. This helps ensure representativeness and independence of observations.
  2. **Normality (or Large Sample Size):** The variable being measured in the population should be approximately normally distributed. If the population is not normal, the assumption can still be reasonably met if the **sample size is sufficiently large** (often cited as $n \ge 30$), due to the Central Limit Theorem, which ensures that the sampling distribution of the sample mean is approximately normal regardless of the population distribution shape. For small sample sizes ($n < 30$), the normality assumption is more critical.
  3. **Independence of Observations:** The observations within the sample must be independent of each other; simple random sampling helps ensure this.

The t-test is considered relatively **robust** to moderate violations of the normality assumption, especially as the sample size increases.


Procedure for Conducting a One-Sample t-Test

Conducting a one-sample t-test follows the general steps of hypothesis testing:

  1. State the Hypotheses:

    Define the null ($H_0$) and alternative ($H_a$) hypotheses about the population mean $\mu$. $\mu_0$ represents the hypothesized value of the population mean specified in the null hypothesis.

    • Null Hypothesis ($H_0$): The population mean is equal to the hypothesized value. $H_0: \mu = \mu_0$. (Sometimes $H_0: \mu \ge \mu_0$ or $H_0: \mu \le \mu_0$ for one-tailed tests, but $\mu = \mu_0$ is the value used in the test statistic).
    • Alternative Hypothesis ($H_a$): The population mean is different from, greater than, or less than the hypothesized value. This determines the tail(s) of the test:
      • Two-tailed: $H_a: \mu \neq \mu_0$
      • Right-tailed: $H_a: \mu > \mu_0$
      • Left-tailed: $H_a: \mu < \mu_0$
  2. Set the Criteria for Decision:

    Determine the threshold for rejecting the null hypothesis.

    • Choose the **Level of Significance ($\alpha$)**: Common values are 0.05 or 0.01.
    • Determine the **Degrees of Freedom (df)**: For a one-sample t-test, $df = n - 1$, where $n$ is the sample size.
    • Identify the **Critical t-value(s) ($t_{crit}$)**: Use the t-distribution table (or software) with the chosen $\alpha$ and $df$ to find the critical value(s) that define the rejection region. The location of the rejection region depends on $H_a$ (one tail or two tails).
  3. Compute the Test Statistic:

    Calculate the sample mean ($\bar{x}$) and sample standard deviation ($s$) from the collected data. Then, compute the value of the one-sample t-test statistic:

    $$t_{calculated} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

    ... (1)

    Where $\bar{x}$ is the sample mean, $\mu_0$ is the hypothesized population mean (from $H_0$), $s$ is the sample standard deviation, and $n$ is the sample size. The term $s/\sqrt{n}$ is the standard error of the mean when $\sigma$ is estimated by $s$.

  4. Make a Statistical Decision:

    Compare the calculated test statistic ($t_{calculated}$) with the critical value(s) ($t_{crit}$), or compare the p-value with $\alpha$.

    • **Using the Critical Value Approach:** If $t_{calculated}$ falls into the critical (rejection) region defined in step 2, **reject $H_0$**. Otherwise, **fail to reject $H_0$**. (For a right-tailed test, reject if $t_{calc} \ge t_{crit}$; for a left-tailed test, reject if $t_{calc} \le t_{crit}$; for a two-tailed test, reject if $|t_{calc}| \ge |t_{crit}|$).
    • **Using the P-value Approach:** Calculate the p-value associated with $t_{calculated}$ and $df$. If the p-value $\le \alpha$, **reject $H_0$**. If the p-value $> \alpha$, **fail to reject $H_0$**. (The p-value is the area in the tail(s) of the t-distribution beyond $t_{calculated}$).
  5. Interpret the Conclusion:

    State the decision in the context of the original problem and the research question, referring back to the hypotheses. Use non-technical language where appropriate.

    • If $H_0$ was rejected: Conclude that there is statistically significant evidence (at the chosen $\alpha$ level) to support the alternative hypothesis ($H_a$) in the population.
    • If $H_0$ was not rejected: Conclude that there is not sufficient evidence (at the chosen $\alpha$ level) to reject the null hypothesis ($H_0$). Avoid stating that $H_0$ is accepted.
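The computational core of steps 2–4 can be sketched in code. This is a minimal standard-library illustration (the helper names are mine, not from a statistics package); the critical value still has to come from a t-table or software.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Return the one-sample t statistic (eq. (1)) and its degrees of freedom."""
    se = s / math.sqrt(n)            # estimated standard error of the mean
    return (xbar - mu0) / se, n - 1

def decide_left_tailed(t_calc, t_crit):
    """Decision rule for a left-tailed test: reject H0 if t_calc <= t_crit."""
    return "reject H0" if t_calc <= t_crit else "fail to reject H0"
```

For instance, with $\bar{x}=497$, $\mu_0=500$, $s=4$, $n=10$, `one_sample_t(497, 500, 4, 10)` returns approximately `(-2.372, 9)`.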

Example

Example 1. A machine is supposed to fill bottles with an average of 500 ml of liquid. To check if it's working correctly, a quality control engineer takes a random sample of 10 bottles. The sample mean fill volume is 497 ml, and the sample standard deviation is 4 ml. At the $\alpha=0.05$ level of significance, is there evidence to suggest that the machine is underfilling the bottles on average?

Answer:

Given: Sample size $n=10$, sample mean $\bar{x}=497$ ml, sample standard deviation $s=4$ ml. Hypothesized population mean $\mu_0=500$ ml. Level of significance $\alpha=0.05$.

To Test: If the average fill volume is significantly less than 500 ml.

Solution:

Step 1: State the Hypotheses.

The claim is that the machine is underfilling, meaning the average fill volume is less than 500 ml.

  • Null Hypothesis ($H_0$): The machine is filling correctly on average (mean is 500 ml or more). $H_0: \mu = 500$ (or $\mu \ge 500$).
  • Alternative Hypothesis ($H_a$): The machine is underfilling on average (mean is less than 500 ml). $H_a: \mu < 500$. This indicates a **left-tailed test**.

Step 2: Set the Criteria for Decision.

  • Level of Significance: $\alpha = 0.05$.
  • Test Statistic: Since $\sigma$ is unknown and estimated by $s$, we use a one-sample t-test. The test statistic is $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$.
  • Sampling Distribution: t-distribution with $df = n - 1 = 10 - 1 = 9$.
  • Critical Region: For a left-tailed test with $\alpha = 0.05$ and $df = 9$, we look up the t-table. The critical t-value $t_{crit}$ such that $P(T \le t_{crit}) = 0.05$ with 9 degrees of freedom is approximately -1.833.

Critical Value $t_{crit} \approx -1.833$

... (i)

Rejection Region: Reject $H_0$ if $t_{calculated} \le -1.833$.

Step 3: Compute the Test Statistic.

Using the given sample data ($\bar{x}=497$, $s=4$, $n=10$) and $\mu_0=500$ from $H_0$:

$$t_{calculated} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

... (ii)

$$t_{calculated} = \frac{497 - 500}{4 / \sqrt{10}}$$

$$t_{calculated} = \frac{-3}{4 / \sqrt{10}}$$

Calculate $\sqrt{10} \approx 3.162$.

$$t_{calculated} \approx \frac{-3}{4 / 3.162} \approx \frac{-3}{1.2649}$$

$$t_{calculated} \approx -2.372$$

... (iii)

Step 4: Make a Statistical Decision.

Compare $t_{calculated} \approx -2.372$ with the critical value $t_{crit} = -1.833$.

Since $-2.372$ is less than $-1.833$ (i.e., $-2.372$ falls into the critical region on the left tail of the t-distribution), we **reject $H_0$**.

(Using the p-value approach: The p-value for $t=-2.372$ with $df=9$ is the area to the left of -2.372. Looking up a t-table or using software, this p-value is approximately 0.021. Since $0.021 \le 0.05$ (our $\alpha$), we reject $H_0$).

Step 5: Interpret the Conclusion.

At the 0.05 level of significance, there is statistically significant evidence from the sample to conclude that the average fill volume of the bottles from this machine is less than 500 ml. This suggests the machine is underfilling.
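The p-value quoted above can be reproduced without special tables by integrating the t-density numerically. This is an illustrative, standard-library-only sketch (a statistics package would provide the CDF directly); the trapezoidal rule over a wide finite range is accurate enough here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, lo=-50.0, steps=20000):
    """P(T <= t) by trapezoidal integration of the density (illustrative only)."""
    h = (t - lo) / steps
    total = 0.5 * (t_pdf(lo, df) + t_pdf(t, df))
    for i in range(1, steps):
        total += t_pdf(lo + i * h, df)
    return total * h

# Left-tail p-value for the bottle-filling example: t = -2.372, df = 9
print(round(t_cdf(-2.372, 9), 3))  # approximately 0.021, matching the table lookup
```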




t-Test (two independent groups t-test): Procedure and Application


Purpose

The **independent samples t-test**, often simply called the **two-sample t-test**, is a statistical hypothesis test used to determine if there is a statistically significant difference between the **means of two independent groups**. This test is used when the population standard deviations ($\sigma_1$ and $\sigma_2$) are unknown and are estimated using sample standard deviations ($s_1$ and $s_2$).

It is applicable when you have data from two separate, unrelated groups (meaning the selection of individuals in one group does not influence the selection of individuals in the other group) and you want to compare their average values for a quantitative variable.

Examples: comparing the mean test scores of students taught by two different methods, or the mean fill volumes of two filling machines.


Assumptions for the Independent Samples t-Test

For the results of an independent samples t-test to be reliable, the following assumptions should be met:

  1. **Independence of Samples:** The two samples must be independent. This means the selection of individuals in one sample does not affect the selection of individuals in the other sample.
  2. **Random Samples:** Both samples should be simple random samples from their respective populations.
  3. **Normality (or Large Sample Sizes):** The variable being measured should be approximately normally distributed in both populations. If the populations are not normal, the assumption can be relaxed if both sample sizes ($n_1$ and $n_2$) are sufficiently large (often $n_1 \ge 30$ and $n_2 \ge 30$ are used as guidelines), due to the Central Limit Theorem.
  4. **Equality of Variances (Homoscedasticity):** The standard version of the independent samples t-test (often called the pooled t-test) assumes that the variances of the two populations are equal ($\sigma_1^2 = \sigma_2^2$). If this assumption is violated (heteroscedasticity), a modified version of the test, known as **Welch's t-test** (or the unequal variances t-test), should be used. There are statistical tests, like Levene's test, that can be performed to check the assumption of equality of variances.

Like the one-sample t-test, the independent samples t-test is reasonably robust to moderate violations of the normality assumption, especially with larger sample sizes. Violations of the independence or random sampling assumptions are more serious.


Procedure for Conducting an Independent Samples t-Test (Assuming Equal Variances)

The procedure follows the standard hypothesis testing framework:

  1. State the Hypotheses:

    Define the null ($H_0$) and alternative ($H_a$) hypotheses about the difference between the two population means, $\mu_1$ and $\mu_2$.

    • Null Hypothesis ($H_0$): There is no difference between the population means. $H_0: \mu_1 = \mu_2$ (or equivalently, $H_0: \mu_1 - \mu_2 = 0$).
    • Alternative Hypothesis ($H_a$): The population means are different, or one is greater than the other. This determines the tail(s) of the test:
      • Two-tailed: $H_a: \mu_1 \neq \mu_2$
      • Right-tailed: $H_a: \mu_1 > \mu_2$
      • Left-tailed: $H_a: \mu_1 < \mu_2$
  2. Set the Criteria for Decision:

    Determine the rules for rejecting the null hypothesis.

    • Choose the **Level of Significance ($\alpha$)**: Typically 0.05 or 0.01.
    • Calculate the **Pooled Variance Estimate ($s_p^2$)**: Since we assume equal population variances ($\sigma_1^2 = \sigma_2^2$), we combine the information from both sample variances ($s_1^2$ and $s_2^2$) to get a better estimate of the common population variance. The pooled variance $s_p^2$ is a weighted average of $s_1^2$ and $s_2^2$, weighted by their respective degrees of freedom $(n_i - 1)$:

      $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

      ... (1)

      (Where $s_1^2$ and $s_2^2$ are the sample variances calculated using $n_1-1$ and $n_2-1$ in their own denominators, respectively). The pooled standard deviation is $s_p = \sqrt{s_p^2}$.

    • Determine the **Degrees of Freedom (df)**: For the independent samples t-test assuming equal variances, $df = n_1 + n_2 - 2$.
    • Identify the **Critical t-value(s) ($t_{crit}$)**: Use the t-distribution table (or software) with the chosen $\alpha$ and $df$ to find the critical value(s) that define the rejection region, based on $H_a$.
  3. Collect Sample Data and Compute Test Statistic:

    Gather data from independent random samples for each group. Calculate the sample means ($\bar{x}_1, \bar{x}_2$) and sample standard deviations ($s_1, s_2$) or variances ($s_1^2, s_2^2$) for both groups. Then, compute the value of the two-sample t-test statistic:

    $$t_{calculated} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

    ... (2)

    Where $(\mu_1 - \mu_2)_0$ is the hypothesized difference between population means under $H_0$. In most tests of "no difference", this value is 0. So the formula simplifies to:

    $$t_{calculated} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

    ... (3)

    Where $s_p = \sqrt{s_p^2}$. The denominator represents the pooled standard error of the difference between means.

  4. Make a Statistical Decision:

    Compare the calculated test statistic ($t_{calculated}$) with the critical value(s) ($t_{crit}$) from step 2, or compare the p-value with $\alpha$.

    • **Critical Value Approach:** If $t_{calculated}$ falls in the critical (rejection) region defined by $t_{crit}$ and $H_a$, **reject $H_0$**. Otherwise, **fail to reject $H_0$**. (e.g., for a two-tailed test, reject if $|t_{calc}| \ge |t_{crit}|$).
    • **P-value Approach:** Find the p-value associated with $t_{calculated}$ and $df$. If p-value $\le \alpha$, **reject $H_0$**. If p-value $> \alpha$, **fail to reject $H_0$**.
  5. Interpret the Conclusion:

    State the final decision and its meaning in the context of the problem, referring to the difference between the two population means ($\mu_1$ and $\mu_2$).

    • If $H_0$ was rejected: Conclude that there is statistically significant evidence (at the chosen $\alpha$ level) of a difference (or difference in a specific direction) between the two population means.
    • If $H_0$ was not rejected: Conclude that there is not sufficient evidence (at the chosen $\alpha$ level) to suggest a significant difference between the two population means.

Note: If the equal variances assumption is not met, use Welch's t-test. The main difference is in the formula for the standard error and the calculation of degrees of freedom (which becomes a more complex formula not typically needed in introductory contexts).
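The pooled computation in steps 2 and 3 can be sketched as follows (standard library only; the function names are illustrative, not from any statistics package):

```python
import math

def pooled_variance(n1, var1, n2, var2):
    """Pooled estimate of the common variance, eq. (1)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

def two_sample_t(xbar1, var1, n1, xbar2, var2, n2):
    """Pooled two-sample t statistic (eq. (3)) and its degrees of freedom."""
    sp2 = pooled_variance(n1, var1, n2, var2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))  # pooled standard error of the difference
    return (xbar1 - xbar2) / se, n1 + n2 - 2
```

For instance, `two_sample_t(85, 25, 12, 81, 20, 10)` returns roughly `(1.959, 20)`.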


Example

Example 1. A teacher wants to compare the effectiveness of two different teaching methods, Method A and Method B, based on student test scores. A random sample of $n_1=12$ students taught by Method A has a mean score $\bar{x}_1 = 85$ with a sample variance $s_1^2 = 25$. A random sample of $n_2=10$ students taught by Method B has a mean score $\bar{x}_2 = 81$ with a sample variance $s_2^2 = 20$. Assuming the population variances are equal, perform a hypothesis test at the $\alpha=0.05$ level of significance to determine if there is a significant difference in the mean test scores between the two methods.

Answer:

Given: Two independent samples. Sample A: $n_1=12, \bar{x}_1=85, s_1^2=25$. Sample B: $n_2=10, \bar{x}_2=81, s_2^2=20$. Assume $\sigma_1^2 = \sigma_2^2$. $\alpha=0.05$.

To Test: Is there a significant difference between $\mu_1$ and $\mu_2$?

Solution:

Step 1: State the Hypotheses.

We are testing for *any* significant difference, so it's a two-tailed test.

  • Null Hypothesis ($H_0$): There is no difference in the true mean test scores between the two methods. $H_0: \mu_1 = \mu_2$.
  • Alternative Hypothesis ($H_a$): There is a difference in the true mean test scores between the two methods. $H_a: \mu_1 \neq \mu_2$.

Step 2: Set the Criteria for Decision.

  • Level of Significance: $\alpha = 0.05$.
  • Test Statistic: We use the two-sample t-test for independent groups, assuming equal variances. The test statistic is $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$.
  • Degrees of Freedom: $df = n_1 + n_2 - 2 = 12 + 10 - 2 = 20$.
  • Pooled Variance ($s_p^2$):

    $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

... (i)

    $$s_p^2 = \frac{(12 - 1)(25) + (10 - 1)(20)}{12 + 10 - 2} = \frac{11(25) + 9(20)}{20}$$

    $$s_p^2 = \frac{275 + 180}{20} = \frac{455}{20} = 22.75$$

... (ii)

    Pooled standard deviation $s_p = \sqrt{22.75} \approx 4.7697$.

  • Critical t-values ($t_{crit}$): For a two-tailed test at $\alpha=0.05$ with $df=20$, the total $\alpha=0.05$ is split equally into both tails ($\alpha/2 = 0.025$ in each tail). We look up the t-table for $df=20$ and area in upper tail = 0.025. The critical value is approximately 2.086. Since it's two-tailed, the critical values are $\pm 2.086$.

Critical Values $t_{crit} \approx \pm 2.086$

... (iii)

Rejection Region: Reject $H_0$ if $t_{calculated} \le -2.086$ or $t_{calculated} \ge 2.086$ (i.e., $|t_{calculated}| \ge 2.086$).

Step 3: Compute the Test Statistic.

Using the sample data and the pooled standard deviation $s_p \approx 4.7697$:

$$t_{calculated} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

... (iv)

$$t_{calculated} = \frac{85 - 81}{4.7697 \sqrt{\frac{1}{12} + \frac{1}{10}}}$$

$$t_{calculated} = \frac{4}{4.7697 \sqrt{\frac{10 + 12}{120}}}$$

$$t_{calculated} = \frac{4}{4.7697 \sqrt{\frac{22}{120}}} \approx \frac{4}{4.7697 \sqrt{0.1833...}}$$

$$t_{calculated} \approx \frac{4}{4.7697 \times 0.42817}$$

$$t_{calculated} \approx \frac{4}{2.0422}$$

$$t_{calculated} \approx 1.959$$

... (v)

Step 4: Make a Statistical Decision.

Compare the calculated test statistic $t_{calculated} \approx 1.959$ with the critical values $t_{crit} = \pm 2.086$.

Since $-2.086 < 1.959 < 2.086$, the calculated t-statistic falls in the non-rejection region (equivalently, $|1.959| < 2.086$).

Therefore, we **fail to reject $H_0$**.

(Using the p-value approach: For $t=1.959$ with $df=20$, the two-tailed p-value is approximately 0.064. Since $0.064 > 0.05$ (our $\alpha$), we fail to reject $H_0$).

Step 5: Interpret the Conclusion.

At the 0.05 level of significance, there is not sufficient statistical evidence from the sample data to conclude that there is a significant difference in the true mean test scores between students taught by Method A and students taught by Method B.
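As with the one-sample case, the two-tailed p-value can be reproduced by numerically integrating the t-density. This standard-library sketch doubles the upper-tail area, using the symmetry of the t-distribution (a statistics package's CDF would normally be used instead):

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, hi=50.0, steps=20000):
    """P(T >= t) by trapezoidal integration of the density (illustrative only)."""
    h = (hi - t) / steps
    total = 0.5 * (t_pdf(t, df) + t_pdf(hi, df))
    for i in range(1, steps):
        total += t_pdf(t + i * h, df)
    return total * h

# Two-tailed p-value for t = 1.959 with df = 20; symmetry doubles one tail.
p_two_tailed = 2 * upper_tail(1.959, 20)
print(round(p_two_tailed, 3))  # just above alpha = 0.05, so H0 is not rejected
```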